MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data.

Authors

  • K Quandt
  • K Frech
  • H Karas
  • E Wingender
  • T Werner
Abstract

The identification of potential regulatory motifs in new sequence data is increasingly important for experimental design. Those motifs are commonly located by matches to IUPAC strings derived from consensus sequences. Although this method is simple and widely used, a major drawback of IUPAC strings is that they necessarily remove much of the information originally present in the set of sequences. Nucleotide distribution matrices retain most of the information and are thus better suited to evaluate new potential sites. However, sufficiently large libraries of pre-compiled matrices are a prerequisite for practical application of any matrix-based approach and are just beginning to emerge. Here we present a set of tools for molecular biologists that allows generation of new matrices and detection of potential sequence matches by automatic searches with a library of pre-compiled matrices. We also supply a large library (> 200) of transcription factor binding site matrices that has been compiled on the basis of published matrices as well as entries from the TRANSFAC database, with emphasis on sequences with experimentally verified binding capacity. Our search method includes position weighting of the matrices based on the information content of individual positions and calculates a relative matrix similarity. We show several examples suggesting that this matrix similarity is useful in estimating the functional potential of matrix matches and thus provides a valuable basis for designing appropriate experiments.
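The abstract names two ingredients of the search method, position weighting by information content and a relative matrix similarity, without giving formulas. The following is a minimal sketch of how such a score could be computed, not the authors' implementation. It assumes Shannon information content in bits as the position weight and assumes the similarity is normalized by the best score the matrix can achieve. The toy matrix, the function names, and the threshold are hypothetical.

```python
import math

BASES = "ACGT"

# Hypothetical 4-position count matrix (one dict of base counts per position).
# Assumption: every column contains at least one observed base.
toy_matrix = [
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 5, "C": 5, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]


def position_weights(counts):
    """Information content per position in bits (0 = uniform, 2 = fully conserved).

    Assumption: Shannon information stands in for the position weighting
    mentioned in the abstract.
    """
    weights = []
    for column in counts:
        total = sum(column.values())
        entropy = 0.0
        for base in BASES:
            p = column.get(base, 0) / total
            if p > 0:
                entropy -= p * math.log2(p)
        weights.append(2.0 - entropy)
    return weights


def matrix_similarity(counts, site):
    """Weighted score of `site`, divided by the best achievable weighted score."""
    if len(site) != len(counts):
        raise ValueError("site length must equal matrix width")
    weights = position_weights(counts)
    score = 0.0
    best = 0.0
    for column, weight, base in zip(counts, weights, site.upper()):
        score += weight * column.get(base, 0)   # unknown bases (e.g. N) score 0
        best += weight * max(column.values())
    return score / best if best else 0.0


def scan(counts, sequence, threshold=0.85):
    """Return (start, window, similarity) for every window at or above threshold."""
    width = len(counts)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        similarity = matrix_similarity(counts, window)
        if similarity >= threshold:
            hits.append((start, window, similarity))
    return hits
```

With this toy matrix, `AGAT` and `AGCT` both score 1.0 because position 3 tolerates A or C equally, which an IUPAC string could also express as `AGMT`. The difference shows up for `AGGT`: the IUPAC string rejects it outright, while the matrix gives it a graded score below 1.0, penalized only lightly because the mismatch falls on the least conserved position.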

Similar resources

Single Nucleotide Polymorphisms and Association Studies: A Few Critical Points

Uncovering DNA sequence variations that correlate with phenotypic changes, e.g., diseases, is the aim of sequence variation studies. A common type of sequence variation is the single nucleotide polymorphism (SNP, pronounced "snip"). SNPs are third-generation molecular markers. An SNP is a DNA sequence variant of a single base pair with the minor allele occurring in more than 1% of a given popula...

Detection of Novel Mutation in LukS Panton-Valentine Leukocidin Gene in Twelve Isolates of Staphylococcus aureus from Sudanese Patients

Staphylococcus aureus strains carrying the PVL gene remain a major health problem associated with highly virulent infections. Characterization of this gene is important for understanding the impact and functional significance of nucleotide variations. PCR and standard sequencing were performed on twelve Sudanese strains from different sources. Protein structure prediction, modeling and physicochemical analysis we...

The vlhA gene sequencing of Iranian Mycoplasma synoviae isolates

The variable lipoprotein haemagglutinin (VlhA) expressed by Mycoplasma synoviae is believed to play a major role in pathogenesis of the disease by mediating adherence and immune evasion. The aim of this study was to sequence Iranian M. synoviae isolates to detect nucleotide variation in the M. synoviae vlhA gene. Using oligonucleotide primers complementary to the single-copy conserved 5´ end...

MatInspector and beyond: promoter analysis based on transcription factor binding sites

MOTIVATION: Promoter analysis is an essential step on the way to identifying regulatory networks. A prerequisite for successful promoter analysis is the prediction of potential transcription factor binding sites (TFBS) with reasonable accuracy. The next steps in promoter analysis can be tackled only with reliable predictions, e.g. finding phylogenetically conserved patterns or identifying higher or...

Study on Genetic Diversity of Terminal Fragment Sequence of Isolated Persian Tobacco Mosaic Virus

Tobacco mosaic virus (TMV) is one of the most devastating plant viruses in the world, infecting more than 200 plant species. The movement protein plays a supportive role in the movement of other plant viruses, and the viral coat protein is highly expressed in infected plants and affects the replication and movement of TMV. In order to investigate genetic variation in the terminal fragment sequence in Iranian...

Journal:
  • Nucleic acids research

Volume 23, Issue 23

Publication date: 1995